neural networks


2023-04-09 15:05 | Source: compiled from the web | Views: 265

For a classification task (I show a pair of exactly two images to a CNN, which should answer 0 for a fake pair or 1 for a real pair), I am struggling to figure out how to design the input.

At the moment the network's architecture looks like this:

The conv layers have a 2x2 stride, thus halving the images' dimensions. I would have put the fully-connected layer first, but at full image resolution it does not fit into my GPU's VRAM. So the first conv layers halve the size of the images, a fully-connected layer then combines the information from both images, and further conv layers do the actual classification on the combined image information.
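A back-of-the-envelope sketch of why the fully-connected layer is so sensitive to image size. The 200x100 image size comes from the question; everything else is an assumption for illustration only: 3 channels, float32 weights, a 3x3 kernel with padding 1, and a hypothetical dense layer that maps the flattened pair to one image-sized output.

```python
BYTES_PER_FLOAT32 = 4


def conv_output_size(size, kernel, stride, padding):
    """Spatial output size of a convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def dense_weight_count(height, width, channels):
    """Weights of a (hypothetical) dense layer that maps a flattened
    image pair to a single image-sized output, ignoring biases."""
    n_in = 2 * height * width * channels   # both images, flattened
    n_out = height * width * channels      # one combined "image"
    return n_in * n_out


# 200x100 is from the question; channels=3 is an assumption.
height, width, channels = 100, 200, 3

# Dense layer applied directly to the raw image pair.
full = dense_weight_count(height, width, channels)

# Dense layer applied after one stride-2 convolution
# (assumed 3x3 kernel, padding 1, channel count kept at 3).
half_h = conv_output_size(height, kernel=3, stride=2, padding=1)
half_w = conv_output_size(width, kernel=3, stride=2, padding=1)
halved = dense_weight_count(half_h, half_w, channels)

print(f"raw input:    {full * BYTES_PER_FLOAT32 / 1e9:.1f} GB of weights")
print(f"after stride: {halved * BYTES_PER_FLOAT32 / 1e9:.1f} GB of weights")
```

Under these assumptions, halving both spatial dimensions cuts the weight count by a factor of 16, because the input and the output of the dense layer each shrink by a factor of 4.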

My very first idea was to simply average the two images, i.e. (image-1 + image-2) / 2, but this is not a good idea, since it heavily mixes up the image information.
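A tiny illustration of that mixing, using made-up 2x2 "images" stored as nested lists (the pixel values and the helper name `average_pair` are hypothetical): very different pairs collapse to the same averaged input, so the network cannot tell them apart.

```python
def average_pair(image_1, image_2):
    """Pixel-wise mean of two equally sized images (nested lists)."""
    return [
        [(a + b) / 2 for a, b in zip(row_1, row_2)]
        for row_1, row_2 in zip(image_1, image_2)
    ]


# Hypothetical 2x2 grayscale images with values in [0, 1].
dark = [[0.0, 0.0], [0.0, 0.0]]
bright = [[1.0, 1.0], [1.0, 1.0]]
gray = [[0.5, 0.5], [0.5, 0.5]]

mixed = average_pair(dark, bright)    # two very different images
swapped = average_pair(bright, dark)  # same pair, opposite order
same = average_pair(gray, gray)       # two identical images

# All three pairs yield exactly the same network input.
print(mixed == swapped == same)
```

The averaged input discards both the order of the images and which image contributed which pixel value.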

My next attempt was to concatenate the images into one single image of size 400x100 instead of two 200x100 images. However, the results of this approach were quite unstable. I think this is because, in the center of the big concatenated image, the convolutions combine information from both images (the right border of image-1 and the left border of image-2), which again mixes up image information in a way that is not really meaningful.
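To make the seam effect concrete, here is a small sketch that lists which kernel windows of the first conv layer overlap both halves of a side-by-side pair. The image width of 200 and the stride of 2 come from the question; the kernel width of 3, the absence of padding, and the helper name `seam_windows` are assumptions.

```python
def seam_windows(image_width, kernel, stride):
    """Start columns of kernel windows (no padding) that cover pixels
    from both images of a horizontally concatenated pair."""
    total_width = 2 * image_width
    starts = range(0, total_width - kernel + 1, stride)
    # A window [x, x + kernel - 1] straddles the seam if it begins in
    # image-1 (x < image_width) and ends in image-2.
    return [x for x in starts if x < image_width <= x + kernel - 1]


# Width 200 and stride 2 are from the question; kernel=3 is assumed.
first_layer = seam_windows(image_width=200, kernel=3, stride=2)
print(first_layer)

# A wider kernel at stride 1 mixes more columns across the seam.
wider = seam_windows(image_width=200, kernel=5, stride=1)
print(wider)
```

In the first layer only a few window positions per row straddle the seam, but the affected region widens with every further conv layer as the receptive field grows.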

My last approach is the current architecture, which simply leaves the combination of image-1 and image-2 to a single fully-connected layer. This works, kind of: the results converge nicely, but they could be better.

What is a reasonable, "state-of-the-art" way to combine two images for a CNN's input?

I clearly cannot just increase the batch size and fit the images in there, since the two images of a pair are related to each other, and this relationship would be lost if I fed only one image at a time with a larger batch.
